Skip to content

request interception #930

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 13, 2025
Merged

request interception #930

merged 3 commits into from
Aug 13, 2025

Conversation

sjorsdonkers
Copy link
Contributor

@sjorsdonkers sjorsdonkers commented Aug 7, 2025

Depends on: #922
Also reenables: http_request_start

TODO:

  • Enable: patterns
  • continueWithAuth also Enable.handleAuthRequests
  • fulfillRequest
  • InterceptResponse, also in continueRequest
  • continueRequest.postData
  • continueRequest.headers
  • Common headers and start_callback before intercept
  • Confirm string lifetime of waiting Requests

Example playwright fragment:

const page = await context.newPage();

await page.route('**/*', (route, request) => {
  console.log(`Request ROUTE: ${request.method()} ${request.url()}`);
  route.continue({url: "http://lightpanda.io/", method: "GET"});// or route.abort();
});

await page.goto('/campfire-commerce/');

Copy link
Collaborator

@karlseguin karlseguin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is still in draft.

I like how little impact this has on the http/*.

Do you think it's possible to re-add the http_request_fail and http_request_complete? We could then merge this into nonblocking_libcurl and then into main, as these 2 (+http_request_start) are all that's missing.

@sjorsdonkers
Copy link
Contributor Author

I like how little impact this has on the http/*.

My main concern, and what I got stuck on for while before ignoring it, is dealing with the start_callback. My understanding is that it can modify the headers and the body, but I'm not sure why it happens as a callback as opposed to when creating the Request.

Do you think it's possible to re-add the http_request_fail and http_request_complete? We could then merge this into nonblocking_libcurl and then into main, as these 2 (+http_request_start) are all that's missing.

I'm OK with both adding them here or taking the http_request_start out and moving it into nonblocking_libcurl. We'll see whatever turns out more convenient.

@karlseguin
Copy link
Collaborator

The only thing using the startCallback is XHR (ScriptManager is only using to log). I was going to say we can remove it..but...

XHR does two things with this:
1 - it sets the headers. This we could replace. The Request already allows a header to be passed.

2 - it gets a reference to the transfer, which it needs in case xhr.abort() is called. The only solution I can think for this is that request() would return some opaque identifier (probably the request id) and then we can cancel based on the request_id, instead of the transfer. Requires Client to maintain a list of id => transfer which is a bit unfortunate.

@karlseguin karlseguin force-pushed the nonblocking_libcurl branch from 64f79f2 to 079ce5e Compare August 11, 2025 13:38
karlseguin added a commit that referenced this pull request Aug 12, 2025
This ensures that page.wait won't unblock too early. As-is, this isn't an issue
since active can only be 0 if there are no active OR pending requests. However,
with request interception (#930)
it's possible to have no active requests and no pending requests - from the
http client's point of view - but still have pending-on-intercept requests.

An alternative to this would be to undo these changes, and instead change
Page.wait to be intercept-aware. That is, Page.wait would continue to block on
http activity and scheduled tasks, as well as intercepted requests. However,
since the Page doesn't know anything about CDP right now, and it does know
about the http client, maybe doing this in the client is fine.
intercept continue and abort

feedback

First version of headers, no cookies yet
}

fn parseHeader(header_str: []const u8) ?struct { name: []const u8, value: []const u8 } {
const colon_pos = std.mem.indexOf(u8, header_str, ":") orelse return null;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indexOfScalar removes 1 len check.

@@ -467,12 +467,15 @@ pub const Page = struct {
const owned_url = try self.arena.dupeZ(u8, request_url);
self.url = try URL.parse(owned_url, null);

var headers = try HttpClient.Headers.init();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of internal failure cases will result in this leaking. But I think we can leave it as-is for now.

The difficulty in resource management here is that we need to be careful about allocations up to the point where curl_multi_add_handle is called. After that, it's handled by the perform loop. So a simple errdefer won't do, since you risk double-free depending on where the failure is. A simple fix is to not callperformin makeRequest - making curl_multi_add_handle the last-called function - but I feel like delaying perform until the next tick (especially since there's no reason to think the request can't be sent out immediately) is unnecessary latency.

I want to address it, but, at this point, after we merge everything.

@karlseguin karlseguin marked this pull request as ready for review August 13, 2025 06:44
@karlseguin karlseguin merged commit 2dc09c7 into nonblocking_libcurl Aug 13, 2025
12 of 14 checks passed
@karlseguin karlseguin deleted the request_interception branch August 13, 2025 06:44
@github-actions github-actions bot locked and limited conversation to collaborators Aug 13, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants